On the assessment of the added value of new predictive biomarkers
Abstract
BACKGROUND: The surge in biomarker development calls for research on statistical evaluation methodology to rigorously assess emerging biomarkers and classification models. Recently, several authors reported the puzzling observation that, in assessing the added value of new biomarkers to existing ones in a logistic regression model, statistical significance of new predictor variables does not necessarily translate into a statistically significant increase in the area under the ROC curve (AUC). Vickers et al. concluded that this inconsistency is because AUC "has vastly inferior statistical properties," i.e., it is extremely conservative. This statement is based on simulations that misuse the DeLong et al. method. Our purpose is to provide a fair comparison of the likelihood ratio (LR) test and the Wald test versus diagnostic accuracy (AUC) tests.
DISCUSSION: We present a test to compare ideal AUCs of nested linear discriminant functions via an F test. We compare it with the LR test and the Wald test for the logistic regression model. The null hypotheses of these three tests are equivalent; however, the F test is an exact test whereas the LR test and the Wald test are asymptotic tests. Our simulation shows that the F test has the nominal type I error even with a small sample size. Our results also indicate that the LR test and the Wald test have inflated type I errors when the sample size is small, while the type I error converges to the nominal value asymptotically with increasing sample size as expected. We further show that the DeLong et al. method tests a different hypothesis and has the nominal type I error when it is used within its designed scope. Finally, we summarize the pros and cons of all four methods we consider in this paper.
SUMMARY: We show that there is nothing inherently less powerful or disagreeable about ROC analysis for showing the usefulness of new biomarkers or characterizing the performance of classification models. Each statistical method for assessing biomarkers and classification models has its own strengths and weaknesses. Investigators need to choose methods based on the assessment purpose, the biomarker development phase at which the assessment is being performed, the available patient data, and the validity of assumptions behind the methodologies.
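To make the notion of an "ideal AUC" of a linear discriminant function concrete, here is a minimal sketch. It assumes two classes that are multivariate normal with a common covariance matrix; this assumption is made here for illustration and is not stated in the abstract. Under it, the ideal AUC equals Φ(√(D²/2)), where D² is the squared Mahalanobis distance between the class means, so a new biomarker adds value only if it increases D². The function names `ideal_auc` and `empirical_auc` are hypothetical, and the code does not reproduce the paper's F test.

```python
import math


def ideal_auc(mahalanobis_sq: float) -> float:
    """Ideal AUC of the linear discriminant, assuming two normal classes
    with a common covariance matrix: AUC = Phi(sqrt(D^2 / 2))."""
    d = math.sqrt(mahalanobis_sq)
    # Phi(x) = 0.5 * (1 + erf(x / sqrt(2))) with x = d / sqrt(2),
    # so the erf argument simplifies to d / 2.
    return 0.5 * (1.0 + math.erf(d / 2.0))


def empirical_auc(neg_scores, pos_scores) -> float:
    """Mann-Whitney estimate of AUC: the fraction of (positive, negative)
    pairs ranked correctly, with ties counted as one half."""
    wins = sum(
        (p > n) + 0.5 * (p == n)
        for p in pos_scores
        for n in neg_scores
    )
    return wins / (len(pos_scores) * len(neg_scores))


# Illustrative case (an assumption): uncorrelated, unit-variance biomarkers,
# so D^2 is the sum of squared mean differences between the classes.
d2_existing = 1.0 ** 2 + 1.0 ** 2      # two existing biomarkers
d2_with_null = d2_existing + 0.0 ** 2  # new biomarker with no mean shift
d2_with_real = d2_existing + 1.0 ** 2  # new biomarker with a unit shift

auc_existing = ideal_auc(d2_existing)   # Phi(1), about 0.8413
auc_with_null = ideal_auc(d2_with_null)  # unchanged: no added value
auc_with_real = ideal_auc(d2_with_real)  # larger: added value
```

In this sketch, the null hypothesis "the new biomarker does not change the ideal AUC" is the same as "the new biomarker does not change D²", which is the sense in which a test on nested linear discriminant functions can be read as a test on ideal AUCs.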